Beyond Sequential Covering - Boosted Decision Rules
Abstract
Since the beginnings of machine learning, rule induction has been regarded as one of the most important problems in the field. One of the first rule induction algorithms was AQ, introduced by Michalski in the early 1980s. AQ and several other well-known algorithms, such as CN2 and Ripper, are all based on sequential covering. As machine learning advanced, new techniques based on statistical learning were introduced. One of them, called boosting or forward stagewise additive modeling, is a general induction procedure that has proved particularly efficient in binary classification and regression tasks. When boosting is applied to the induction of decision rules, it can be treated as a generalization of sequential covering, because it approximates the solution of the prediction task by sequentially adding new rules to the ensemble without adjusting those that have already entered it. Each rule is fitted by concentrating on the examples that were hardest to classify correctly with the rules already present in the ensemble. In this paper, we present a general scheme for learning an ensemble of decision rules in a boosting framework, using different loss functions and minimization techniques. This scheme, called ENDER, is covered by such algorithms as SLIPPER, LRI and MLRules. A computational experiment compares these algorithms on benchmark data.
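The idea of adding rules one at a time, each fitted to the examples the current ensemble handles worst, can be illustrated with a minimal sketch. This is not the ENDER procedure from the paper. It assumes an exponential loss, rules made of a single threshold condition, labels in {-1, +1}, and a small smoothing constant `eps`; all function names and the rule representation are hypothetical choices made for illustration.

```python
import math

def covers(rule, x):
    """Assumed rule form: one condition on one feature (x[j] <= t or x[j] > t)."""
    j, t = rule["feature"], rule["threshold"]
    return x[j] <= t if rule["leq"] else x[j] > t

def score(rules, x):
    """Ensemble output: sum of the decisions of the rules covering x (0 if none)."""
    return sum(r["alpha"] for r in rules if covers(r, x))

def fit_rule(X, y, w, eps=1e-6):
    """Choose the condition with the largest |sqrt(W+) - sqrt(W-)| on covered weights."""
    best, best_gain = None, -1.0
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for leq in (True, False):
                cand = {"feature": j, "threshold": t, "leq": leq}
                w_pos = sum(wi for x, yi, wi in zip(X, y, w)
                            if yi > 0 and covers(cand, x))
                w_neg = sum(wi for x, yi, wi in zip(X, y, w)
                            if yi < 0 and covers(cand, x))
                gain = abs(math.sqrt(w_pos) - math.sqrt(w_neg))
                if gain > best_gain:
                    # Smoothed decision value for the covered region (eps is assumed).
                    cand["alpha"] = 0.5 * math.log((w_pos + eps) / (w_neg + eps))
                    best, best_gain = cand, gain
    return best

def boost_rules(X, y, n_rules):
    rules = []
    for _ in range(n_rules):
        # Exponential-loss weights: large for examples the ensemble gets wrong.
        w = [math.exp(-yi * score(rules, x)) for x, yi in zip(X, y)]
        rules.append(fit_rule(X, y, w))  # earlier rules are never adjusted
    return rules

# Toy data: one feature, classes separated at 2.5.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [-1, -1, 1, 1]
rules = boost_rules(X, y, n_rules=4)
predictions = [1 if score(rules, x) > 0 else -1 for x in X]
```

On this toy data the first rule only covers the negative examples, so the positive ones keep a high weight and the second rule is fitted to them, which is the reweighting behaviour the abstract describes. Other loss functions or minimization techniques would change how `alpha` and the selection criterion are computed.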
Similar resources
VC-DomLEM: Rule induction algorithm for variable consistency rough set approaches
We present a general rule induction algorithm based on sequential covering, suitable for variable consistency rough set approaches. This algorithm, called VC-DomLEM, can be used for both ordered and non-ordered data. In the case of ordered data, the rough set model employs dominance relation, and in the case of non-ordered data, it employs indiscernibility relation. VC-DomLEM generates a minima...
Boosted SVM for extracting rules from imbalanced data in application to prediction of the post-operative life expectancy in the lung cancer patients
In this paper, we present a boosted SVM designed to solve imbalanced data problems. The proposed solution combines the benefits of using ensemble classifiers for uneven data with cost-sensitive support vector machines. Further, we present an oracle-based approach for extracting decision rules from the boosted SVM. In the next step we examine the quality of the proposed method by comparing the...
Sequential Optimization of γ-Decision Rules
The paper is devoted to the study of an extension of the dynamic programming approach which allows sequential optimization of approximate decision rules relative to length, coverage and number of misclassifications. The presented algorithm constructs a directed acyclic graph Δγ(T) whose nodes are subtables of the decision table T. Based on the graph Δγ(T) we can describe all irredundant γ-decision r...
A heuristic covering algorithm has higher predictive accuracy than learning all rules
The induction of classification rules has been dominated by a single generic technique—the covering algorithm. This approach employs a simple hill-climbing search to learn sets of rules. Such search is subject to numerous widely known deficiencies. Further, there is a growing body of evidence that learning redundant sets of rules can improve predictive accuracy. The ultimate end-point of a move...
Optimal sequential procedures with Bayes decision rules
In this article, a general problem of sequential statistical inference for general discrete-time stochastic processes is considered. The problem is to minimize an average sample number given that Bayesian risk due to incorrect decision does not exceed some given bound. We characterize the form of optimal sequential stopping rules in this problem. In particular, we have a characterization of the...
Publication date: 2010